Unsupervised Data Partitioning: a Bayesian Approach

Authors

  • Stephen J. Roberts
  • Iead Rezek
Abstract

A Bayesian-based methodology is presented which automatically penalises over-complex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an 'optimal' number of components in the model and so partition data sets.
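
The idea of penalising over-complex mixture models can be illustrated with a minimal sketch. This is not the authors' method: it swaps in the Bayesian information criterion (BIC), a rough large-sample stand-in for a Bayesian complexity penalty, and fits a one-dimensional Gaussian mixture by plain EM. The function names, the synthetic data, the quantile initialisation, and the variance floor are all assumptions made for illustration.

```python
import math
import random


def gmm_log_likelihood(xs, k, iters=100):
    """Fit a 1-D Gaussian mixture with k components by EM.

    Returns the log-likelihood computed in the last E-step.
    """
    n = len(xs)
    ordered = sorted(xs)
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [var_all] * k
    weights = [1.0 / k] * k
    var_floor = 1e-3 * var_all  # keeps a component from collapsing onto one point
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) + 1e-300  # avoids log(0) on numerical underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_floor, spread)
    return loglik


def bic(xs, k):
    """BIC score: lower is better; the second term penalises extra components."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return -2.0 * gmm_log_likelihood(xs, k) + n_params * math.log(len(xs))


def select_num_components(xs, max_k=4):
    """Return the k in 1..max_k with the lowest BIC, plus all scores."""
    scores = {k: bic(xs, k) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores


# Synthetic data (an assumption): two well-separated Gaussian clusters.
rng = random.Random(0)
data = [rng.gauss(-5.0, 1.0) for _ in range(300)]
data += [rng.gauss(5.0, 1.0) for _ in range(300)]

best_k, scores = select_num_components(data)
print("selected number of components:", best_k)
for k, score in sorted(scores.items()):
    print(k, round(score, 1))
```

The likelihood term always improves or stays flat as components are added, so the selection depends on the penalty term. A fully Bayesian treatment of the kind the abstract describes would replace the BIC penalty with one derived from the model evidence.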

Similar resources

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Refining A Divisive Partitioning Algorithm for Unsupervised Clustering

The Principal Direction Divisive Partitioning (PDDP) algorithm is a fast and scalable clustering algorithm [3]. The basic idea is to recursively split the data set into sub-clusters based on principal direction vectors. However, the PDDP algorithm can yield poor results, especially when cluster structures are not well-separated from one another. Its stopping criterion is based on a heuristic th...

Using Bayesian Blocks to Partition Self-Organizing Maps

Self organizing maps (SOMs) are widely-used for unsupervised classification. For this application, they must be combined with some partitioning scheme that can identify boundaries between distinct regions in the maps they produce. We discuss a novel partitioning scheme for SOMs based on the Bayesian Blocks segmentation algorithm of Scargle [1998]. This algorithm minimizes a cost function to ide...

Multivariate Data Grid Models for Supervised and Unsupervised Learning Note technique

This paper introduces a new method to automatically, rapidly and reliably evaluate the class conditional information of any subset of variables in supervised learning. It is based on a partitioning of each input variable, into intervals in the numerical case and into groups of values in the categorical case. The cross-product of the univariate partitions forms a multivariate partition of the in...

Unsupervised Coreference Resolution with HyperGraph Partitioning

Unsupervised-learning based coreference resolution obviates the need for annotation of training data. However, unsupervised approaches have traditionally been relying on the use of mention-pair models, which only consider information pertaining to a pair of mentions at a time. In this paper, it is proposed the use of hypergraph partitioning to overcome this limitation. The mentions are modeled ...


Publication date: 2007